Human Genetics and Genomics Advances

Elsevier BV

Preprints posted in the last 7 days, ranked by how well they match Human Genetics and Genomics Advances's content profile, based on 70 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.

1
Transcriptome-Wide Alternative Splicing Analysis Implicates Complex Events in Bipolar Disorder

Martinez-Jimenez, M.; Garcia-Ortiz, I.; Romero-Miguel, D.; Kavanagh, T.; Marshall, L. L.; Bello Sousa, R. A.; Sanchez Alonso, S.; Alvarez Garcia, R.; Benavente Lopez, S.; Di Stasio, E.; Schofield, P. R.; Baca-Garcia, E.; Mitchell, P. B.; Cooper, A. A.; Fullerton, J. M.; Toma, C.

2026-04-21 genetic and genomic medicine 10.64898/2026.04.19.26351209 medRxiv
Top 0.1%
3.6%

Alternative-splicing events (ASE) increase transcriptomic variability and play key roles in biological functions. The contribution of ASE to bipolar disorder (BD) remains largely unexplored. We performed a Transcriptome-Wide Alternative-Splicing Analysis (TWASA) to identify ASEs and genes potentially involved in BD. The study comprised 635 individuals: a discovery sample (DS) of 31 individuals from eight multiplex BD families (16 BD cases; 15 unaffected relatives), and a replication sample (RS) of 604 subjects (372 BD cases; 232 controls). Sequencing was conducted on RNA from lymphoblastoid cell lines (DS) and whole blood (RS). TWASA was performed using VAST-TOOLS (VT), rMATS (RM), and MAJIQ/MOCCASIN (MCC). Gene-set association analyses of genes containing ASEs were performed across six psychiatric disorders. Novel ASE (nASE) were investigated in the DS using FRASER. Limited gene overlap was observed across TWASA tools. MCC identified 2,031 complex ASEs involving 1,508 genes, showing the strongest genetic association with BD across psychiatric phenotypes. Prioritization of MCC-identified ASE genes yielded 441 candidates, including DOCK2 as top candidate from the DS. Replication was obtained for 98 genes, five with an identical ASE, and four (RBM26, QKI, ANKRD36, and TATDN2) showing a concordant percentage-spliced-in direction with the DS. Finally, 578 nASE were identified in the DS, with no evidence of familial segregation or differences in ASE types. This first TWASA in BD reveals tool-specific variability, complex ASE for genes specifically associated with BD, and novel candidate genes for BD. Alternative transcript isoform abundance may represent a mechanism contributing to BD pathophysiology.

2
Novel Genetic Risk Loci for Pancreatic Ductal Adenocarcinoma Identified in a Genome-wide Study of African Ancestry Individuals

Vergara, C.; Ni, Z.; Zhong, J.; McKean, D.; Connelly, K. E.; Antwi, S. O.; Arslan, A. A.; Bracci, P. M.; Du, M.; Gallinger, S.; Genkinger, J.; Haiman, C. A.; Hassan, M.; Hung, R. J.; Huff, C.; Kooperberg, C.; Kastrinos, F.; LeMarchand, L.; Lee, W.; Lynch, S. M.; Moore, S. C.; Oberg, A. L.; Park, M. A.; Permuth, J. B.; Risch, H. A.; Scheet, P.; Schwartz, A.; Shu, X.-O.; Stolzenberg-Solomon, R. Z.; Wolpin, B. M.; Zheng, W.; Albanes, D.; Andreotti, G.; Bamlet, W. R.; Beane-Freeman, L.; Berndt, S. I.; Brennan, P.; Buring, J. E.; Cabrera-Castro, N.; Campa, D.; Canzian, F.; Chanock, S. J.; Chen, Y.;

2026-04-22 genetic and genomic medicine 10.64898/2026.04.21.26351329 medRxiv
Top 0.4%
1.3%

Pancreatic cancer disproportionately affects Black individuals in the United States, but they have limited representation in genetic studies of pancreatic ductal adenocarcinoma (PDAC). To address this gap, we performed admixture mapping and genome-wide association analysis (GWAS) in genetically inferred African ancestry individuals (1,030 cases and 889 controls). Admixture mapping identified three regions with a significantly higher proportion of African ancestry in cases compared to controls (5q33.3, 10p1, 22q12.3). GWAS identified a genome-wide significant association at 5p15.33 (CLPTM1L, rs383009:T>C, T Allele Frequency=0.51, OR:1.45, P value=1.24×10⁻⁸), a locus previously associated with PDAC. Known loci at 5p15.33, 7q32.3, 8q24.21 and 7q25.1 also replicated (P value <0.01). Multi-ancestral fine-mapping identified two potential causal SNPs (rs3830069 and rs2735940) at 5p15.33. Collectively, these findings identified novel PDAC risk loci and expanded our understanding of this deadly cancer in underrepresented populations, emphasizing the multifactorial nature of PDAC risk including inherited genetic and non-genetic factors. Statement of Significance: To understand how genetic variation contributes to PDAC risk in Black people in North America, we studied individuals of genetically-inferred African ancestry. We identified novel risk loci and differences in the contribution of known loci. This demonstrates that ancestry-informed genetic analyses improve our understanding of PDAC risk and enhance discovery.

3
Multi-ancestral GWAS with the VA Million Veteran Program enables functional interpretation of rheumatoid arthritis alleles

Sakaue, S.; Yang, D.; Zhang, H.; Posner, D.; Rodriguez, Z.; Love, Z.; Cui, J.; Budu-Aggrey, A.; Ho, Y.-L.; Costa, L.; Monach, P.; Huang, S.; Ishigaki, K.; Melley, C.; Tanukonda, V.; Sangar, R.; Maripuri, M.; Sweet, S. M.; Panickan, V.; McDermott, G.; Hanberg, J. S.; Riley, T.; Laufer, V.; Okada, Y.; Scott, I.; Bridges, S. L.; Baker, J.; VA Million Veteran Program; Wilson, P. W.; Gaziano, J. M.; Hong, C.; Verma, A.; Cho, K.; Huffman, J. E.; Cai, T.; Raychaudhuri, S.; Liao, K. P.

2026-04-23 genetic and genomic medicine 10.64898/2026.04.22.26351423 medRxiv
Top 0.5%
1.2%

Rheumatoid arthritis (RA) is a heritable and common autoimmune condition. To date, most genetic associations were derived from individuals with either European or East Asian ancestries. Here, we applied a multimodal automated phenotyping strategy to define RA and performed a genome-wide association study (GWAS) of RA in the Million Veteran Program (MVP), including underrepresented African American (AFR) and Admixed American (AMR) populations. Meta-analyses with previous RA cohorts identified 152 autosomal genome-wide significant loci, of which 31 were novel. Inclusion of multi-ancestry data dramatically improved fine-mapping resolution. Functional characterization of these loci using single-cell transcriptomic and chromatin data suggested new RA genes such as CHD7 and CD247. We identified underappreciated functional roles of fine-grained immune cell states other than T cells, such as B cell and myeloid cell states. We observed that multi-ancestry polygenic risk scores using our data demonstrated better predictive ability, especially for AFR and AMR populations.

4
Comprehensive Exome Sequencing in Swedish Patients with Spontaneous Coronary Artery Dissection

Gunnarsson, C.; Ellegard, R.; Ahsberg, J.; Huda, S.; Andersson, J.; Dworeck, C. F.; Glaser, N.; Erlinge, D.; Loghman, H.; Johnston, N.; Mannila, M.; Pagonis, C.; Ravn-Fischer, A.; Rydberg, E.; Welen Schef, K.; Tornvall, P.; Sederholm Lawesson, S.; Swahn, E. E.

2026-04-24 genetic and genomic medicine 10.64898/2026.04.22.26351535 medRxiv
Top 0.6%
0.9%

Background: Spontaneous coronary artery dissection (SCAD) is a well-recognised cause of acute coronary syndrome, particularly among women without conventional cardiovascular risk factors. Increasing evidence indicates a genetic contribution; however, the underlying genetic architecture of SCAD remains insufficiently understood. Objective: The aim of this study was to assess the prevalence of rare variants in previously reported SCAD-associated genes and to explore the potential presence of novel genetic alterations in well-characterised Swedish patients with SCAD. Methods: The study comprised 201 patients enrolled in SweSCAD, a national project examining the clinical characteristics, aetiology, and outcomes of SCAD. All individuals had a confirmed diagnosis based on invasive coronary angiography. Comprehensive exome sequencing was performed to identify rare variants contributing to disease susceptibility. Results: Genetic variants that have been associated with SCAD according to current clinical genetics practice for variant reporting were identified in approximately 4% of patients. In addition, rare potentially relevant variants were detected in almost 60% of patients in genes associated with vascular integrity and vascular remodelling. Conclusion: This study supports SCAD as a genetically complex arteriopathy, driven by rare high-impact variants together with broader polygenic susceptibility. Variants in collagen, vascular extracellular matrix, and oestrogen-responsive pathways provide biologically plausible links to female-predominant disease. Although the diagnostic yield of clearly actionable variants is modest, these findings support broader genomic evaluation beyond overt syndromic presentations and highlight the need for larger integrative genomic and functional studies to refine risk stratification and management.

5
Machine Learning Prediction of Disease Trajectories for Children with Juvenile Idiopathic Arthritis

Lee, S.; Davidian, M.; Natter, M. D.; Reeve, B. B.; Schanberg, L. E.; Belkin, E.; Chang, M.-L.; Kimura, Y.; Ong, M.-S.

2026-04-20 rheumatology 10.64898/2026.04.18.26351165 medRxiv
Top 0.6%
0.9%

Background: Despite advances in therapy, optimal management of juvenile idiopathic arthritis (JIA) remains challenging. The ability to predict disease progression in JIA can improve personalized treatment decisions, but few reliable clinical predictors have been identified. We developed machine learning approaches to predict disease trajectories in children with JIA. Methods: Using data from the Childhood Arthritis and Rheumatology Research Alliance (CARRA) Registry (years 2015-2024), we developed machine learning models to predict attainment of inactive disease in children with non-systemic JIA. We applied Dynamic Bayesian Networks (DBN) to model temporal dependencies and causal relationships, and Convolutional Neural Networks (CNN) to capture complex non-linear patterns. Model input included demographic factors, longitudinal clinical factors, and medication use in the preceding 12 months. Findings: A total of 8,093 participants were included. When tested on an independent test cohort, both DBN (AUC:0.76; precision:0.73; recall:0.83; F1-score:0.78; accuracy:0.71) and CNN (AUC:0.76; precision:0.71; recall:0.63; F1-score:0.67; accuracy:0.70) models achieved comparable performance in predicting inactive disease. Disease activity levels in the preceding 12 months, presence of enthesitis and uveitis were the strongest predictors. Causal relationships captured in the DBN model revealed suboptimal care patterns, likely shaped by insurance constraints and a predominantly reactive approach to JIA management. Interpretation: Our study demonstrates that machine learning approaches can predict disease trajectories in JIA with good discriminative performance. Unlike prior studies that predict outcomes at single timepoints, our models are the first to predict inactive disease longitudinally. However, suboptimal care patterns in retrospective data limit the models' capacity to learn treatment-outcome relationships, underscoring critical opportunities to improve JIA care and the need for prospective comparative studies to better inform prediction models. Funding: Patient-Centered Outcomes Research Institute (PCORI) Award (ME-2022C2-25573-IC). RESEARCH IN CONTEXT. Evidence before this study: Numerous studies have sought to identify clinical predictors of JIA progression and outcomes. However, few reliable predictors have emerged and existing prediction models demonstrate limited performance. As a result, our ability to personalize treatment decisions based on individual risk of severe disease course remains limited. Added value of this study: We developed novel machine learning models that predict individualized disease trajectories in children with polyarticular and oligoarticular JIA using data from their preceding 12-month clinical course. These models demonstrated strong discriminative performance and outperformed previously published machine learning approaches in JIA. Unlike prior studies limited to single time-point predictions, our models are the first to predict inactive disease longitudinally, enabling a patient-specific projection of disease progression over time. Importantly, our findings also bring to light patterns of suboptimal care, likely driven by insurance constraints and a reactive treatment paradigm, underscoring critical opportunities to improve JIA management. Implications of all the available evidence: Our models have the potential to support clinical decision-making by enabling early identification of children with JIA at risk for unfavorable disease trajectories. In addition, the suboptimal care patterns and systems-level barriers identified through our analyses highlight priority areas for quality improvement initiatives and policy interventions to reduce gaps in JIA care delivery.

6
From GWAS to drug: A framework for drug candidate prioritisation using a gene expression signature matching approach

Chauquet, S.; Jiang, J.-C.; Barker, L. F.; Hunter, Z. L.; Singh, G.; Wray, N. R.; McRae, A. F.; Shah, S.

2026-04-24 genetic and genomic medicine 10.64898/2026.04.22.26349470 medRxiv
Top 0.7%
0.8%

Drug targets supported by human genetic evidence have significantly higher approval rates, making genome-wide association studies a valuable resource for drug candidate prioritisation. Transcriptome-wide association study signature-matching is an emerging in silico approach that integrates GWAS data with expression quantitative trait loci to generate a disease gene expression signature, which is then compared against drug perturbation databases such as the Connectivity Map. Despite recent adoption, there is no consensus on optimal methodology. Here, we systematically benchmark key parameters, including TWAS method, eQTL tissue model, similarity metric, gene set size, and CMap cell line, using LDL cholesterol, familial combined hyperlipidemia, and asthma as proof-of-concept traits. We demonstrate that while TWAS signature-matching can successfully prioritise known first-line treatments, performance is highly sensitive to parameter choice; for instance, the selection of the cell line used for drug signatures alone can dramatically alter drug prioritisation. Based on these findings, we propose a best-practice framework for robust, genetically-informed drug prioritisation using TWAS signature-matching.

7
Diminished sex hormone levels influence the risk of skewed X chromosome inactivation

Roberts, A. L.; Osterdahl, M. F.; Sahoo, A.; Pickles, J.; Franklin-Cheung, C.; Wadge, S.; Mohamoud, N. A.; Morea, A.; Amar, A.; Morris, D. L.; Vyse, T. J.; Steves, C. J.; Small, K. S.

2026-04-22 genetic and genomic medicine 10.64898/2026.04.20.26351303 medRxiv
Top 0.7%
0.8%

Background: X chromosome inactivation (XCI) is the mechanism which randomly silences one X chromosome to equalise gene expression between 46, XX females and 46, XY males. Though XCI is expected to result in a random pattern of mosaicism across tissues, some females display a significantly unbalanced ratio in immune cells, termed XCI-skew, in which ≥75% of cells have the same X inactivated. XCI-skew is associated with adverse health outcomes and its prevalence increases with age - particularly after midlife - yet the specific risk factors have yet to be identified. The menopausal transition, which is driven by profound shifts in sex hormone levels, has significant impact on chronic disease risk yet the molecular and cellular effects are incompletely understood. We hypothesised that the menopausal transition may impact XCI-skew. Methods: Using XCI data measured in blood-derived DNA from 1,395 females from the TwinsUK population cohort, along with questionnaires, genetic data, and sex hormone measures, we carried out a cross-sectional study to assess the impact of the menopausal transition and sex hormones on XCI-skew. Results: We demonstrate that early menopause (<45yrs) is associated with increased risk of XCI-skew. In subset analyses across those who had a surgically induced or natural menopause, we find the association restricted to those who underwent a surgical menopause. We next identify that a low polygenic score (PGS) for testosterone levels is significantly associated with XCI-skew, which we replicate in an independent dataset (n=149), while a PGS for age at natural menopause is not associated. Finally, using longitudinal measures across two time points spanning ~18 years we show XCI-skew is a stable cellular phenotype that typically increases over time. Discussion: These data represent the first environmental and genetic risk factors of XCI-skew, both of which implicate endogenous sex hormone levels, particularly testosterone. We propose XCI-skew may have clinical relevance in postmenopausal females.

8
THRB splice site variants lead to exon 4 skipping and TRβ1 gain-of-function syndrome

Hones, G. S.; Liao, X.-H.; Mahler, E. A.; Herrmann, P.; Eckstein, A.; Fuhrer, D.; Castillo, J. M.; Chiang, J.; Vincent, A. L.; Weiss, R. E.; Dumitrescu, A. M.; Refetoff, S.; Moeller, L. C.

2026-04-22 endocrinology 10.64898/2026.04.15.26349265 medRxiv
Top 0.8%
0.7%

Background: Heterozygous c.283+1G>A and c.283G>A variants in the THRB gene, encoding for thyroid hormone receptor (TR)β1 and β2, lead to autosomal dominant macular dystrophy (ADMD). We report the detailed clinical characterization of two first-degree relatives with ADMD, heterozygous for THRB c.283+1G>A, and an unrelated ADMD patient with a novel variant, c.283G>C. The genomic and molecular consequences of both variants were studied. Methods: gDNA and mRNA were obtained from leukocytes. Clinical characterization included biochemistry, bone density and body composition, ECG, echocardiography, ultrasound, audiometry and color-vision. In vitro assays investigated TR function and DNA binding. Results: The patients manifested no resistance to thyroid hormone beta (RTHβ) and had normal FT4 and TSH. Detailed studies in two patients showed no goiter, tachycardia, hypercholesterinemia or hepatic steatosis. Hearing was not impaired. Both had impaired color vision and reduced bone density. RT-PCR from all three patients revealed skipping of exon 4 exclusive to TRβ1, producing a deletion of 87 amino acids in the N-terminal domain (TRβ1ΔNTD). In vitro, DNA-binding affinity of TRβ1ΔNTD to DR4-TRE with or without RXR was comparable to TRβ1WT. Surprisingly, TRβ1ΔNTD was transcriptionally twice as active as TRβ1WT with a similar EC50 for T3, demonstrating gain-of-function of TRβ1ΔNTD. THRA expression in leukocytes was increased by 3-fold compared to unrelated controls and different from RTHβ patients. Conclusion: These THRB splice site variants produce TRβ1 exon 4 skipping, resulting in a gain-of-function mutant, TRβ1ΔNTD. This explains the dominant ADMD phenotype devoid of RTHβ and suggests a TRβ1 gain-of-function syndrome.

9
Comparative fine-mapping of breast cancer susceptibility loci using summary statistics methods and multinomial regression

O'Mahony, D. G.; Beasley, J.; Zanti, M.; Dennis, J.; Dutta, D.; Kraft, P.; Kristensen, V.; Chenevix-Trench, G.; Easton, D. F.; Michailidou, K.

2026-04-22 epidemiology 10.64898/2026.04.21.26351364 medRxiv
Top 0.8%
0.7%

Summary statistics fine-mapping methods offer advantages over classical methods, including avoiding data-sharing constraints and improved modelling of correlated variables and sparse effects. However, their performance has not been comprehensively evaluated in breast cancer using real-world data. Previous multinomial stepwise regression (MNR) fine-mapping analyses for breast cancer identified 196 credible sets. Here, we apply summary statistics fine-mapping, compare methods, and assess parameters influencing performance. Using summary statistics from the Breast Cancer Association Consortium, we compared finiMOM, SuSiE, and FINEMAP to published MNR results across 129 regions. Performance was assessed by recall using in-sample and out-of-sample LD. Discordant credible sets were examined for technical factors, and target genes were defined using the INQUISIT pipeline. SuSiE showed the closest agreement with MNR. Results varied across regions depending on the assumed number of causal variants (L), with higher values reducing recall and no single L maximising performance. At optimal L per region, SuSiE identified 8,192 CCVs in 244 credible sets, with recall of 88%, 86%, and 72% for overall, ER-positive, and ER-negative breast cancer. Thirty MNR sets were missed. Discordance was partially explained by allele flips, imputation quality, and array heterogeneity. Fifty-two MNR-identified genes, including BRCA2, WNT7B and CREBBP, were not recovered, while additional candidate genes were identified. Using out-of-sample LD reduced recall by 3% but identified novel variants. Fine-mapping results vary across methods, and no single approach is sufficient. The choice of L strongly influences results, and combining analytical approaches with functional validation can improve causal variant identification.

10
Ensemble Approaches to Screening, Diagnosis, and Subtyping of Multiple Sclerosis

Yang, I. Y.; Patil, A.; Jin, O.; Loud, S.; Buxhoeveden, S.; Zhang, D. Y.

2026-04-21 genetic and genomic medicine 10.64898/2026.04.19.26351230 medRxiv
Top 0.9%
0.7%

Multiple sclerosis (MS) is a debilitating disease affecting more than 1 million Americans, and today is assessed primarily through magnetic resonance imaging (MRI) and observational clinical symptoms. Given the autoimmune nature of MS, we hypothesized that high-dimensional gene expression data from peripheral blood mononuclear cells (PBMCs), when analyzed with the assistance of AI, may collectively serve as valuable biomarkers for the real-time risk and progression of MS. Here, we present PBMC RNA sequencing (RNAseq) results from N=997 samples, including 540 MS, 221 neuromyelitis optica (NMO), and 149 healthy controls. We constructed and optimized ensemble models for three clinical outcomes: (1) discrimination of early MS (EDSS ≤ 2.0) from healthy individuals with 74% AUC at 100% coverage, (2) differential diagnosis of MS from NMO with 91% AUC at 80% coverage, and (3) subtyping RRMS from progressive MS with 79% AUC at 80% coverage. To our knowledge, no prior molecular test has been reported for any of these three MS clinical tasks, and these results may have immediate impact on clinical management of MS patients. Two innovations improved the stratification accuracy of our models: selection of gene sets based on expression variance in disease states, and use of non-linear rank sort and conviction weighting in the ensemble score calculation.

11
Widespread genetic effect heterogeneity impacts bias and power in nonlinear Mendelian randomization

Wang, J.; Morrison, J.

2026-04-20 epidemiology 10.64898/2026.04.17.26351133 medRxiv
Top 1%
0.5%

Mendelian randomization (MR) uses genetic variants as instrumental variables to infer causal relationships between complex traits. Standard MR can be used to estimate an average causal effect at the population level, and typically assumes a linear exposure-outcome relationship. Recently, several methods for estimating nonlinear effects have been developed. However, many have been found to produce spurious empirical findings when subjected to negative control analyses. We propose that this poor performance may be attributable to heterogeneity in variant-exposure associations. We demonstrate that heterogeneous genetic effects on exposure lead to biased estimates, poor coverage, and inflated type I error in control function and stratification-based methods. In contrast, two-stage least squares (TSLS) methods are robust to such heterogeneity, but suffer from low precision and low power in some circumstances. We show that a statistical test for heterogeneity can be used to guide the choice of nonlinear MR methods. Using UK Biobank data, we reassess the causal effects of BMI, vitamin D, and alcohol consumption on blood pressure, lipid, C-reactive protein, and age (negative control). We find strong evidence of heterogeneity for all three exposures, and also recapitulate previous results that control function and stratification-based methods are prone to false positives. Finally, using nonparametric TSLS, we identify evidence of nonlinear causal effects of BMI on HDL cholesterol, triglycerides, and C-reactive protein; however, specific estimates of the shape of these relationships are imprecise. Altogether, our results suggest that common nonlinear MR methods are unreliable in the presence of realistic levels of heterogeneity, and that more methodological development is required before practically useful nonlinear MR is feasible.

12
CalPred yields calibrated intervals for polygenic risk prediction

Shi, Z.; Zhang, Z.; Mandla, R.; Hou, K.; Pasaniuc, B.

2026-04-22 genetic and genomic medicine 10.64898/2026.04.21.26351410 medRxiv
Top 1%
0.5%

Polygenic scores (PGS) have emerged as a useful biomarker for stratification of high-risk individuals in genomic medicine, with prediction intervals arising as a principled approach to incorporate statistical uncertainty in their individual-level predictions. In contrast to recent reports by Xu et al. [7], we show that CalPred [6] provides well-calibrated prediction intervals that contain the trait phenotypes at targeted confidence levels. CalPred maintains calibration when PGS performance varies across contextual factors (e.g., ancestry, age, sex, or socio-economic factors) whereas PredInterval [7] - a recently introduced method that focuses on marginal calibration across all individuals - exhibits miscalibration.

13
A variance QTL approach to uncover gene-fish oil supplement interaction loci for 14 circulating unsaturated fatty acid traits

Ihejirika, S. A.; Stephen, E.; Ye, K.

2026-04-20 genetic and genomic medicine 10.64898/2026.04.13.26350791 medRxiv
Top 1%
0.5%

Gene-environment interactions (GEI) contribute to circulating polyunsaturated fatty acid (PUFA) and monounsaturated fatty acid (MUFA) profiles. GEI may partly explain differences in trait variance across genotype groups. To identify GEI for circulating unsaturated fatty acids, we adopted a two-stage strategy. First, we detected quantitative trait loci associated with trait variance (vQTLs). Second, we tested these vQTLs for interaction with fish oil supplements (FOS). We performed genome-wide vQTL screens for 14 plasma PUFA and MUFA phenotypes in a UK Biobank subset of 200,478 participants. At the genome-wide significance threshold (p < 5.0 × 10⁻⁸), we identified 172 vQTL-trait pairs across all 14 traits, and 16 of these vQTLs had no marginal genetic effect on the corresponding trait. We found 46 non-overlapping loci across all phenotypes, with an average of 12 vQTLs per trait. Omega-6% and PUFA% had the most independent vQTLs (N = 24) while DHA% and Omega-3% had the fewest (N = 1 and 2, respectively). For each of the 172 vQTL-trait pairs, we tested the interaction effect of the vQTL with FOS on the corresponding trait. We found six significant interaction signals in DHA, DHA%, Omega-3, Omega-3%, LA, and Omega-6/Omega-3 ratio around the FADS1/2, ZPR1, and SUGP1/TM6SF2 genes. Our results provide a comprehensive resource of vQTLs and gene-FOS interactions shaping the circulating levels of unsaturated fatty acids.

14
Duplication within 14q32.13 implicates a chimeric CLMN::SYNE3 RNA transcript in cerebellar ataxia

Litster, T. M.; Wilcox, R. A.; Carroll, R.; Gardner, A. E.; Nazri, N. M.; Shoubridge, C. A.; Delatycki, M. B.; Lohmann, K.; Agzarian, M.; Turella Divani, R.; Rafehi, H.; Scott, L.; Monahan, G.; Lamont, P. J.; Ashton, C.; Laing, N. G.; Ravenscroft, G.; Bahlo, M.; Haan, E.; Lockhart, P. J.; Friend, K. L.; Corbett, M. A.; Gecz, J.

2026-04-24 genetic and genomic medicine 10.64898/2026.04.23.26350376 medRxiv
Top 1%
0.5%

The spinocerebellar ataxias (SCAs) are a clinically heterogeneous group of neurodegenerative disorders that affect movement, vision, speech and balance. Here, we reassign the linkage of SCA30 to 14q32.13 based on a cumulative LOD score >12. Within this interval we identified a 331 kb duplication, absent in population controls and not observed in >800 unrelated individuals with genetically unresolved cerebellar ataxia. RNASeq analysis of patient-derived lymphoblastoid cell lines revealed a splice-mediated chimeric transcript resulting from the duplication event. This transcript joined exon 1 of CLMN to exon 2 of SYNE3. In silico translation predicted that this chimeric transcript would produce a short N-terminal peptide corresponding to exon 1 of CLMN and the usually untranslated region of exon 2 of SYNE3 fused to the complete and in-frame SYNE3 protein. Transient overexpression of SYNE3 or the CLMN::SYNE3 fusion protein, in both HeLa cells and mouse primary cortical neurons, resulted in equivalent cellular outcomes including altered nuclear morphology and chromosomal DNA fragmentation. SYNE3 forms part of the linker of nucleoskeleton and cytoskeleton complex and is not usually expressed in cerebellar Purkyně neurons, while CLMN has a Purkyně-specific expression pattern within the brain. Our data suggest that ectopic expression of SYNE3 in cerebellar Purkyně neurons, mediated by the CLMN promoter, leads to cerebellar atrophy and causes spinocerebellar ataxia in the SCA30 family. This is an example of Mendelian disease arising from a novel, chimeric transcript with a likely dominant negative effect. Chimeric transcripts are commonly associated with cancers, but they are not often associated with monogenic disorders. Detection of chimeric transcripts as part of structural variant analysis could increase the genetic diagnostic yield of Mendelian disorders.

15
Rare protein-disrupting variants in NPY5R, DLGAP1 and MAPK8IP3 segregate with OCD in two multiplex pedigrees potentially implicating energy homeostasis and post-synaptic signalling in molecular etiology.

Ormond, C.; Cap, M.; Chang, Y.-C.; Ryan, N.; Chavira, D.; Williams, K.; Grant, J. E.; Mathews, C.; Heron, E. A.; Corvin, A.

2026-04-22 psychiatry and clinical psychology 10.64898/2026.04.21.26350600 medRxiv
Top 1%
0.4%

Obsessive compulsive disorder (OCD) is significantly heritable, but only a fraction of the contributory genetic variation has been identified, and the molecular etiology involved remains obscure. Identifying rare contributory variants of large effect would be an important milestone in helping to elucidate the mechanisms involved. Analysis of densely affected pedigrees is a potentially useful strategy to bypass the sample size challenges of standard case-control approaches. Here we performed whole genome sequencing (WGS) of 25 individuals across two multiplex OCD pedigrees. We prioritised rare variants using a Bayesian inference approach which incorporates variant pathogenicity and co-segregation with OCD. In the first pedigree, we identified a highly deleterious missense variant in NPY5R, carried by the majority of affected individuals. This gene is brain-expressed and has previously been implicated in panic disorder and internet addiction GWAS studies. In the second pedigree, we identified a large deletion of DLGAP1 and a missense variant in MAPK8IP3 that perfectly co-segregated in a specific branch of the family: both genes have previously been implicated in OCD and autism. Both genes contribute to a protein interaction network including ERBB4 and RAPGEF1 which we had previously identified in a large Tourette Syndrome pedigree. Our analysis suggests that energy homeostasis and downstream signalling from the post-synaptic density may both be important avenues for future research.

16
Sex stratified analyses enable new genetic insights into brain imaging phenotypes

Zhang, N.; Wang, S.; Fu, J.; Ji, Y.; Liu, N.; Qian, Q.; Xue, H.; Ding, H.; Liang, M.; Qin, W.; Xu, J.; Yu, C.

2026-04-21 genetics 10.64898/2026.04.21.719541 medRxiv
Top 1%
0.4%

Sex differences are commonly observed in neuroimaging phenotypes and in the risk of brain diseases, yet the underlying genetic mechanisms remain poorly understood. We investigated sex differences in the genetic architecture of 805 neuroimaging phenotypes in 22,950 males and 22,950 females matched for sample size and covariates, and systematically compared sex-stratified with sex-combined genetic analyses. We found eight variant-trait associations with significant sex differences, 235 fine-mapped sex-dominant causal associations, 457 sex-dominant colocalizations with sex hormones, and 96 sex-dominant colocalizations with schizophrenia. Compared with sex-combined analysis, sex-stratified analysis identified 47 new genetic associations, 170 new fine-mapped causal associations, 1,019 new colocalizations with sex hormones, and 191 new colocalizations with schizophrenia. Additionally, sex-stratified analysis improved global heritability and genetic-correlation estimates and enhanced polygenic prediction for certain phenotypes. This work highlights the need to routinely perform sex-stratified genetic association analyses to elucidate sex-specific and sex-shared genetic control of neuroimaging phenotypes and related disorders.

17
Drug-Target Mendelian Randomization and Imaging Mediation Analyses Reveal Therapeutic Targets and Causal Mechanisms for Cardiomyopathies

Wang, P.; Song, Y.; Zhang, B.; Yang, J.

2026-04-22 cardiovascular medicine 10.64898/2026.04.20.26351344 medRxiv
Top 2%
0.4%

Background: Hypertrophic (HCM) and dilated (DCM) cardiomyopathy constitute the principal phenotypes of primary cardiomyopathy, yet both lack sufficient therapeutic options. Integrating genetic insights with detailed cardiac phenotyping offers a promising strategy to prioritize targets and elucidate their mechanisms of action. Methods: We conducted a three-stage analysis. First, drug-target Mendelian randomization (MR) was performed using cis-acting protein (pQTL) and expression (eQTL) quantitative trait loci as genetic instruments for potential drug targets. Second, we examined causal associations between 82 cardiac magnetic resonance (CMR)-derived imaging traits and HCM/DCM risk in a CMR-based MR analysis. Third, mediation MR was employed to quantify the proportion of the genetic effect of prioritized drug targets on cardiomyopathy risk that was mediated through specific CMR phenotypes. Results: Our analyses identified 19 and 13 potential therapeutic targets for HCM and DCM, respectively. CMR-based MR revealed that HCM risk was causally associated with increased right ventricular ejection fraction (RVEF) and greater left ventricular wall thickness, whereas DCM risk was linked to ventricular dilation, impaired myocardial strain, and altered aortic dimensions. Critically, mediation analysis established that these CMR traits served as significant intermediate pathways. The protective effect of ALPK3 on HCM risk was mediated through a reduction in myocardial wall thickness. Conversely, the effects of PDLIM5, HSPA4, and FBXO32 on DCM risk were exerted in part via alterations in aortic dimensions. Conclusion: This integrative genetic and imaging study systematically identifies candidate therapeutic targets for HCM and DCM and delineates the specific CMR phenotypes through which they likely exert their causal effects. Our findings advance the understanding of disease pathogenesis and highlight new possibilities for improving the diagnosis and management of cardiomyopathy.

18
De novo EHMT2 variants cause an autosomal dominant EHMT2-related Kleefstra syndrome via loss of G9a methyltransferase activity.

Hnizda, A.; Martinez-Delgado, B.; Sanchez-Ponce, D.; Alonso, J.; Amiel, J.; Attie-Bitach, T.; Bada-Navarro, A.; Baladron, B.; Bermejo-Sanchez, E.; Brinsa, V.; Bukova, I.; Cazorla-Calleja, R.; Cervenkova, S.; Chow, S.; Dusek, P.; Fedosieieva, O.; Fernandez-Prieto, M.; Ghosh, S.; Gomez-Mariano, G.; Gregorova, A.; Hamilton, M. J.; Hartmannova, H.; Hernandez-San Miguel, E.; Herrero-Matesanz, M.; Hodanova, K.; Kadek, A.; Kerkhof, J.; Kleefstra, T.; Lacombe, D.; Levy, M. A.; Lopez-Martin, E.; Lyse, R.; Man, P.; Marin-Reina, P.; Macnamara, E. F.; McConkey, H.; Melenovska, P.; Mielu, L. M.; Moore, D.;

2026-04-20 genetics 10.1101/2025.09.25.678439 medRxiv
Top 2%
0.3%

The EHMT1 and EHMT2 genes encode human euchromatin histone lysine methyltransferases 1 and 2 (EHMT1 alias GLP; EHMT2 alias G9a), which form heteromeric GLP/G9a complexes with essential roles in epigenetic regulation of gene expression. While EHMT1 haploinsufficiency has been established as the cause of Kleefstra syndrome 1, the pathogenesis of G9a dysfunction in human disease remains largely unknown. We identified seven de novo EHMT2 variants in patients with clinical presentation, episignatures, histone modifications and transcriptomic profiles similar to those of Kleefstra syndrome 1. In vitro studies revealed that these variants encode structurally stable G9a proteins that are catalytically incompetent due to aberrant interactions either with the histone H3 tail or with S-adenosylmethionine. Heterozygous mice carrying a patient-derived variant exhibited growth retardation, facial/skull dysmorphia and aberrant behavior. Here we report pathogenic EHMT2 variants that likely exert a dominant-negative effect on GLP/G9a complexes and thus genocopy EHMT1 haploinsufficiency via a distinct molecular mechanism, defining an autosomal dominant EHMT2-related Kleefstra syndrome.

19
Biventricular cardiac dynamic shape: genetics and cardiometabolic disease associations

Burns, R.; Young, W. J.; Uddin, K.; Petersen, S. E.; Ramirez, J.; Young, A. A.; Munroe, P. B.

2026-04-20 genetic and genomic medicine 10.64898/2026.04.19.26350940 medRxiv
Top 2%
0.3%

Background: Genetic studies using cardiac magnetic resonance (CMR) imaging have identified loci related to cardiac shape, but most focus on static morphology. The value of a dynamic cardiac shape atlas capturing both shape and function remains unknown. Methods: A dynamic shape atlas comprising CMR-derived shape models at end-diastole and end-systole was combined with genetic and outcome data in 36,992 UK Biobank participants. Dynamic shape principal components (PCs) describing >1% of variance were characterized, and tested for associations with prevalent and incident cardiometabolic diseases, including ischemic heart disease (IHD), heart failure (HF), significant atrioventricular block (AVB), and atrial fibrillation (AF), and independent predictive power alongside standard CMR measures. Genome-wide association studies (GWAS) were performed to identify candidate genes and biological pathways, and polygenic risk scores (PRS) were assessed for disease associations. Mendelian randomization (MR) was performed to test causality of observed disease associations. Results: We identified 14 dynamic cardiac shape PCs capturing 83.3% of total dynamic cardiac shape variance. These PCs captured distinct functional remodeling patterns such as variation in annular plane systolic excursion, while remaining only modestly correlated with standard CMR measures. All 14 PCs were associated with at least one incident cardiometabolic disease, with the strongest associations observed for incident IHD, HF, and AVB. Notably, incorporating dynamic shape PCs improved the prediction of incident IHD beyond standard CMR measures. GWAS identified 75 genetic loci associated with dynamic shape, including 14 variants previously unreported for cardiac traits, and candidate genes demonstrated enrichment in pathways related to cardiac development and contractile function. PRS derived from dynamic shape loci were significantly associated with multiple outcomes, most prominently HF. MR identified significant causal relationships between several PCs and cardiometabolic disease. Conclusions: Dynamic cardiac shape features capture aspects of cardiac structure and function not fully represented by standard CMR measures. These features are strongly associated with incident cardiometabolic disease and provide new insights into the genetic architecture of cardiac remodeling. Clinical perspective: What is new? Genetic and outcome relationships with a dynamic statistical shape model capturing both left and right ventricles at end-diastole and end-systole. Demonstration of incremental value over existing cardiac shape models, through capture of functional remodeling not represented by standard imaging measures. Identification of genetic susceptibility loci for dynamic cardiac shape, including 14 variants not previously reported for cardiac traits. What are the clinical implications? The results enhance our understanding of the genetic architecture of dynamic cardiac shape and function in the general population and clarify their relationships with other cardiovascular endophenotypes and incident cardiometabolic diseases. Newly identified candidate genes expand the biological pathways implicated in cardiac remodeling and provide targets for future functional and mechanistic studies. The improved prediction of incident cardiometabolic disease, particularly ischemic heart disease, achieved by adding dynamic shape PCs to traditional CMR measures suggests potential value for their inclusion in evaluation of patients.

20
Methylation profiling in the Million Veteran Program: design, quality control, and smoking-associated epigenetic signatures

Schreiner, P. A.; Markianos, K.; Francis, M.; Despard, B.; Gorman, B. R.; Said, I.; Dong, F.; Gautam, S.; Dochtermann, D.; Shi, Y.; Devineni, P.; Kirkpatrick, C.; Khazanov, N.; Moser, J.; Million Veteran Program, ; Huang, G. D.; Muralidhar, S.; Tsao, P. S.; Pyarajan, S.

2026-04-23 genetic and genomic medicine 10.64898/2026.04.22.26351491 medRxiv
Top 2%
0.3%

The Million Veteran Program (MVP) represents the largest and one of the most diverse single cohorts associated with longitudinal electronic health record (EHR) data. We profiled a subset of samples from MVP using the Illumina Infinium MethylationEPIC BeadChip (EPIC array) to generate one of the largest single-cohort methylation datasets to date. Methylation profiles were analyzed for 45,460 total individuals, with the most populous ancestries composed of 27,455 Europeans, 11,798 African Americans, and 4,859 Admixed Americans. We detail the strict quality control standards implemented to ensure the most robust method of methylation profiling of the MVP cohort. This dataset was then applied to evaluate the effects of smoking exposure on DNA methylation in MVP participants. Ancestry-stratified epigenome-wide association studies (EWAS) of smoking status (ever/never) were performed using over 750,000 probes with certifiable signal. Our multi-ancestry meta-analysis demonstrates replicability with existing EWAS and identifies 3,207 novel probe-smoking associations unlocked via the depth and breadth of data in this cohort.